AITopics | analysis method

Collaborating Authors

analysis method

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Causal Abstractions of Neural Networks

Neural Information Processing SystemsDec-24-2025, 02:47:10 GMT

Structural analysis methods (e.g., probing and feature attribution) are increasingly important tools for neural network analysis. We propose a new structural analysis method grounded in a formal theory of causal abstraction that provides rich characterizations of model-internal representations and their roles in input/output behavior. In this method, neural representations are aligned with variables in interpretable causal models, and then interchange interventions are used to experimentally verify that the neural representations have the causal properties of their aligned variables. We apply this method in a case study to analyze neural models trained on Multiply Quantified Natural Language Inference (MQNLI) corpus, a highly complex NLI dataset that was constructed with a tree-structured natural logic causal model. We discover that a BERT-based model with state-of-the-art performance successfully realizes parts of the natural logic model's causal structure, whereas a simpler baseline model fails to show any such structure, demonstrating that neural representations encode the compositional structure of MQNLI examples.

causal abstraction, name change, neural network, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.31)

Add feedback

Behavioral Fingerprinting of Large Language Models

Pei, Zehua, Zhen, Hui-Ling, Zhang, Ying, Yang, Zhiyuan, Li, Xing, Yu, Xianzhi, Yuan, Mingxuan, Yu, Bei

arXiv.org Artificial IntelligenceSep-8-2025

Current benchmarks for Large Language Models (LLMs) primarily focus on performance metrics, often failing to capture the nuanced behavioral characteristics that differentiate them. This paper introduces a novel ``Behavioral Fingerprinting'' framework designed to move beyond traditional evaluation by creating a multi-faceted profile of a model's intrinsic cognitive and interactive styles. Using a curated \textit{Diagnostic Prompt Suite} and an innovative, automated evaluation pipeline where a powerful LLM acts as an impartial judge, we analyze eighteen models across capability tiers. Our results reveal a critical divergence in the LLM landscape: while core capabilities like abstract and causal reasoning are converging among top models, alignment-related behaviors such as sycophancy and semantic robustness vary dramatically. We further document a cross-model default persona clustering (ISTJ/ESTJ) that likely reflects common alignment incentives. Taken together, this suggests that a model's interactive nature is not an emergent property of its scale or reasoning power, but a direct consequence of specific, and highly variable, developer alignment strategies. Our framework provides a reproducible and scalable methodology for uncovering these deep behavioral differences. Project: https://github.com/JarvisPei/Behavioral-Fingerprinting

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2509.04504

Country: North America (0.28)

Genre: Research Report > New Finding (0.48)

Industry:

Government (1.00)
Transportation (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

The Impact of Artificial Intelligence on Human Thought

Gesnot, Rénald

arXiv.org Artificial IntelligenceAug-26-2025

This research paper examines, from a multidimensional perspective (cognitive, social, ethical, and philosophical), how AI is transforming human thought. It highlights a cognitive offloading effect: the externalization of mental functions to AI can reduce intellectual engagement and weaken critical thinking. On the social level, algorithmic personalization creates filter bubbles that limit the diversity of opinions and can lead to the homogenization of thought and polarization. This research also describes the mechanisms of algorithmic manipulation (exploitation of cognitive biases, automated disinformation, etc.) that amplify AI's power of influence. Finally, the question of potential artificial consciousness is discussed, along with its ethical implications. The report as a whole underscores the risks that AI poses to human intellectual autonomy and creativity, while proposing avenues (education, transparency, governance) to align AI development with the interests of humanity.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2508.16628

Country:

North America > United States (1.00)
Europe (1.00)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)
Instructional Material (1.00)

Industry:

Media > News (1.00)
Leisure & Entertainment (1.00)
Information Technology > Services (1.00)
(10 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
(6 more...)

Add feedback

What do self-supervised speech models know about Dutch? Analyzing advantages of language-specific pre-training

Kloots, Marianne de Heer, Mohebbi, Hosein, Pouw, Charlotte, Shen, Gaofei, Zuidema, Willem, Bentum, Martijn

arXiv.org Artificial IntelligenceJul-11-2025

How language-specific are speech representations learned by self-supervised models? Existing work has shown that a range of linguistic features can be successfully decoded from end-to-end models trained only on speech recordings. However, it's less clear to what extent pre-training on specific languages improves language-specific linguistic information. Here we test the encoding of Dutch phonetic and lexical information in internal representations of self-supervised Wav2V ec2 models. Pre-training exclusively on Dutch improves the representation of Dutch linguistic features as compared to pre-training on similar amounts of English or larger amounts of multilingual data. This language-specific advantage is well-detected by trained clustering or classification probes, and partially observable using zero-shot metrics.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.21437/Interspeech.2025-1526

2506.00981

Country:

Europe (0.28)
North America > Mexico (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

The use of cross validation in the analysis of designed experiments

Weese, Maria L., Smucker, Byran J., Edwards, David J.

arXiv.org Machine LearningJun-18-2025

Cross-validation (CV) is a common method to tune machine learning methods and can be used for model selection in regression as well. Because of the structured nature of small, traditional experimental designs, the literature has warned against using CV in their analysis. The striking increase in the use of machine learning, and thus CV, in the analysis of experimental designs, has led us to empirically study the effectiveness of CV compared to other methods of selecting models in designed experiments, including the little bootstrap. We consider both response surface settings where prediction is of primary interest, as well as screening where factor selection is most important. Overall, we provide evidence that the use of leave-one-out cross-validation (LOOCV) in the analysis of small, structured is often useful. More general $k$-fold CV may also be competitive but its performance is uneven.

artificial intelligence, experiment, machine learning, (16 more...)

arXiv.org Machine Learning

2506.14593

Country:

North America > United States > Michigan > Wayne County > Detroit (0.04)
North America > United States > South Carolina > Charleston County > Charleston (0.04)
North America > United States > Ohio > Butler County > Oxford (0.04)
(7 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry:

Health & Medicine (1.00)
Materials (0.93)
Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (0.84)

Add feedback

Review for NeurIPS paper: Hierarchical nucleation in deep neural networks

Neural Information Processing SystemsJan-24-2025, 12:18:32 GMT

Weaknesses: The primary weaknesses are 1) lack of novelty, 2) concern as to whether the analysis method is advantageous and appropriate for understanding representation learning in CNNs, and 3) lack of convincing evidence that the four stated hypotheses are valid. Moreover, as argued in A.4, the method is also closely-related to CKA [3]. Thus the novelty comes primarily from the specific hypotheses raised by the authors and the methods used to test them. Note that this is not a substantial weakness, in that if the findings were interesting and the evidence was persuasive then acceptance would be merited. The need to pick a certain number of discrete neighbors seems disadvantageous.

hierarchical nucleation, hypothesis, nucleation, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)

Add feedback

Causal Abstractions of Neural Networks

Neural Information Processing SystemsOct-10-2024, 08:24:09 GMT

causal abstraction, neural network, neural representation, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.65)

Add feedback

Human-like Linguistic Biases in Neural Speech Models: Phonetic Categorization and Phonotactic Constraints in Wav2Vec2.0

Kloots, Marianne de Heer, Zuidema, Willem

arXiv.org Artificial IntelligenceJul-3-2024

What do deep neural speech models know about phonology? Existing work has examined the encoding of individual linguistic units such as phonemes in these models. Here we investigate interactions between units. Inspired by classic experiments on human speech perception, we study how Wav2Vec2 resolves phonotactic constraints. We synthesize sounds on an acoustic continuum between /l/ and /r/ and embed them in controlled contexts where only /l/, only /r/, or neither occur in English. Like humans, Wav2Vec2 models show a bias towards the phonotactically admissable category in processing such ambiguous sounds. Using simple measures to analyze model internals on the level of individual stimuli, we find that this bias emerges in early layers of the model's Transformer module. This effect is amplified by ASR finetuning but also present in fully self-supervised models. Our approach demonstrates how controlled stimulus designs can help localize specific linguistic knowledge in neural speech models.

continuum, probability, representation, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.21437/Interspeech.2024-2490

2407.03005

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward

Xie, Xuan, Song, Jiayang, Zhou, Zhehua, Huang, Yuheng, Song, Da, Ma, Lei

arXiv.org Artificial IntelligenceApr-12-2024

While Large Language Models (LLMs) have seen widespread applications across numerous fields, their limited interpretability poses concerns regarding their safe operations from multiple aspects, e.g., truthfulness, robustness, and fairness. Recent research has started developing quality assurance methods for LLMs, introducing techniques such as offline detector-based or uncertainty estimation methods. However, these approaches predominantly concentrate on post-generation analysis, leaving the online safety analysis for LLMs during the generation phase an unexplored area. To bridge this gap, we conduct in this work a comprehensive evaluation of the effectiveness of existing online safety analysis methods on LLMs. We begin with a pilot study that validates the feasibility of detecting unsafe outputs in the early generation process. Following this, we establish the first publicly available benchmark of online safety analysis for LLMs, including a broad spectrum of methods, models, tasks, datasets, and evaluation metrics. Utilizing this benchmark, we extensively analyze the performance of state-of-the-art online safety analysis methods on both open-source and closed-source LLMs. This analysis reveals the strengths and weaknesses of individual methods and offers valuable insights into selecting the most appropriate method based on specific application scenarios and task requirements. Furthermore, we also explore the potential of using hybridization methods, i.e., combining multiple methods to derive a collective safety conclusion, to enhance the efficacy of online safety analysis for LLMs. Our findings indicate a promising direction for the development of innovative and trustworthy quality assurance methodologies for LLMs, facilitating their reliable deployments across diverse domains.

analysis method, arxiv preprint arxiv, online safety analysis method, (11 more...)

arXiv.org Artificial Intelligence

2404.08517

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Alberta (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
(6 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Information Technology (1.00)
Transportation (0.68)
Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Validation, Robustness, and Accuracy of Perturbation-Based Sensitivity Analysis Methods for Time-Series Deep Learning Models

Wang, Zhengguang

arXiv.org Artificial IntelligenceJan-29-2024

This work undertakes studies to evaluate Interpretability Methods for Time-Series Deep Learning. Sensitivity analysis assesses how input changes affect the output, constituting a key component of interpretation. Among the post-hoc interpretation methods such as back-propagation, perturbation, and approximation, my work will investigate perturbation-based sensitivity Analysis methods on modern Transformer models to benchmark their performances. Specifically, my work answers three research questions: 1) Do different sensitivity analysis (SA) methods yield comparable outputs and attribute importance rankings? 2) Using the same sensitivity analysis method, do different Deep Learning (DL) models impact the output of the sensitivity analysis? 3) How well do the results from sensitivity analysis methods align with the ground truth?

analysis method, research question, sensitivity analysis method, (8 more...)

arXiv.org Artificial Intelligence

2401.16521

Country: North America > United States > Virginia (0.05)

Genre: Research Report > New Finding (0.38)

Industry: Health & Medicine (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback